Semantic Audio-Visual Data Fusion for Automatic Emotion Recognition

Author

  • Dragos Datcu
Abstract

The paper describes a novel technique for recognizing emotions from multimodal data. We focus on the six prototypic emotions. The outputs of facial expression recognition and of speech-based emotion recognition are combined by a bimodal semantic data fusion model that determines the subject's most probable emotion. Two types of models based on geometric face features are used for facial expression recognition, depending on whether speech is present. Our approach defines an algorithm that is robust to the changes in face shape that occur during regular speech: the influence of phoneme generation on face shape is removed by using only features related to the eyes and the eyebrows. The paper includes results from testing the presented models.
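As a rough illustration of the pipeline outlined in the abstract, the sketch below picks a face model depending on whether speech is present and then combines the face and speech outputs into a single distribution over the six prototypic emotions. The abstract does not specify how the fusion model works, so the weighted product rule, the function names, the model labels, and the weight `w_face` are all assumptions made for illustration and should not be read as the paper's method.

```python
from typing import Dict, Optional

# The six prototypic (basic) emotions.
EMOTIONS = ("anger", "disgust", "fear", "happiness", "sadness", "surprise")


def select_face_model(speech_present: bool) -> str:
    """Pick a face model label (hypothetical names).

    During speech, only eye and eyebrow features are used so that mouth
    movement caused by phoneme generation does not distort the result.
    """
    return "eyes_eyebrows" if speech_present else "full_face"


def fuse(face: Dict[str, float],
         speech: Optional[Dict[str, float]],
         w_face: float = 0.5) -> Dict[str, float]:
    """Combine per-emotion probabilities from both modalities.

    Assumed rule: weighted product of the two distributions, renormalised.
    If there is no speech, the face distribution is used on its own.
    """
    if speech is None:
        scores = {e: face[e] for e in EMOTIONS}
    else:
        scores = {e: (face[e] ** w_face) * (speech[e] ** (1.0 - w_face))
                  for e in EMOTIONS}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all fused scores are zero")
    return {e: s / total for e, s in scores.items()}


def most_probable(fused: Dict[str, float]) -> str:
    """Return the emotion with the highest fused probability."""
    return max(fused, key=fused.get)


# Made-up classifier outputs, for demonstration only.
face_probs = {"anger": 0.1, "disgust": 0.1, "fear": 0.1,
              "happiness": 0.5, "sadness": 0.1, "surprise": 0.1}
speech_probs = {"anger": 0.075, "disgust": 0.075, "fear": 0.075,
                "happiness": 0.3, "sadness": 0.075, "surprise": 0.4}

model = select_face_model(speech_present=True)
fused = fuse(face_probs, speech_probs)
print(model, most_probable(fused))
```

A product rule rewards emotions on which both modalities agree; a weighted sum or a learned probabilistic model would be equally plausible choices under the same interface.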

Similar resources

Evidence Theory-Based Multimodal Emotion Recognition

Automatic recognition of human affective states is still a largely unexplored and challenging topic. Even more issues arise when dealing with variable quality of the inputs or aiming for real-time, unconstrained, and person independent scenarios. In this paper, we explore audio-visual multimodal emotion recognition. We present SAMMI, a framework designed to extract real-time emotion appraisals ...

Audio-Visual Spontaneous Emotion Recognition

Automatic multimodal recognition of spontaneous emotional expressions is a largely unexplored and challenging problem. In this paper, we explore audio-visual emotion recognition in a realistic human conversation setting—the Adult Attachment Interview (AAI). Based on the assumption that facial expression and vocal expression are at the same coarse affective states, positive and negative emotion ...

Emotion recognition from speech using prosodic features

Emotion recognition, a key step of affective computing, is the process of decoding an embedded emotional message from human communication signals, e.g. visual, audio, and/or other physiological cues. It is well-known that speech is the main channel for human communication and thus vital in the signalling of emotion and semantic cues for the correct interpretation of contexts. In the verbal chan...

Audio-Visual Emotion Recognition Using Semi-Coupled HMM and Error-Weighted Classifier Combination

This paper presents an approach to automatic recognition of emotional states from audio-visual bimodal signals using semi-coupled hidden Markov model and error weighted classifier combination for Human-Computer Interaction (HCI). The proposed model combines a simplified state-based bimodal alignment strategy and a Bayesian classifier weighting scheme to obtain the optimal solution for audio-vis...

The First Audio/Visual Emotion Challenge and Workshop - An Introduction

The Audio/Visual Emotion Challenge and Workshop (http://sspnet.eu/avec2011) is the first competition event aimed at comparison of automatic audio, visual, and audiovisual emotion analysis. The goal of the challenge is to provide a common benchmark test set for individual multimodal information processing and to bring together the audio and video emotion recognition communities, to compare the ...

Publication date: 2008